METHOD FOR IMPROVING DISK MIRRORING ERROR RECOVERY IN A COMPUTER SYSTEM
INCLUDING AN ALTERNATE COMMUNICATION PATH



SPECIFICATION 

To all whom it may concern:

Be it known that Richard Rollins, Michael
Ohran, Randall C. Johnson, Scott Bonsteel, and
Richard S. Ohran, citizens of the United States of
America, have invented a new and useful invention
entitled METHOD FOR IMPROVING ERROR RECOVERY
PERFORMANCE IN A FAULT-TOLERANT COMPUTER SYSTEM of
which the following comprises a complete
specification.

METHOD FOR IMPROVING ERROR RECOVERY PERFORMANCE
IN A FAULT-TOLERANT COMPUTER SYSTEM

Microfiche Appendix. This specification includes a
Microfiche Appendix which includes 1 page of
microfiche and a total of 13 frames. The
Microfiche Appendix includes computer source code
illustrative of one preferred embodiment of the
present invention.

Background of the Invention

Field of the Invention. This invention relates to
fault-tolerant computer systems, and in particular
to the methods used to recover from a computer
failure in a system with redundant computers each
with its own mass storage system(s).

Description of Related Art. It is often desirable
to provide continuous operation of computer
systems, particularly file servers which support a
number of user workstations or personal computers
on a network. To achieve this continuous
operation, it is necessary for the computer system
to be tolerant of software and hardware problems or
faults. This is generally done by having redundant
computers and redundant mass storage systems, such
that a backup computer or disk drive is immediately
available to take over in the event of a fault.

A number of techniques for implementing a
fault-tolerant computer system are described in
Major et al., United States Patent 5,157,663, which
is hereby incorporated by reference in its
entirety, and Major's cited references. In
particular, the invention of Major provides a
replicated network file server capable of
recovering from the failure of either the computer
or the mass storage system of one of the two file
servers. It has been used by Novell to implement
its SFT-III fault-tolerant file server product.

Figure 1 illustrates the hardware
configuration for a fault-tolerant computer system
100, such as described in Major. There are two
server computer systems 110 and 120 connected to
network 101, from which they receive requests from
client computers. While we refer to computers 110
and 120 as "server computer systems" or simply
"servers" and show them in that role in the
examples herein, this should not be regarded as
limiting the present invention to computers used
only as servers for other computer systems.

Server computer system 110 has computer
111 which includes a central processing unit and
appropriate memory systems and other peripherals.
Server computer system 120 has computer 121 which
includes a central processing unit and appropriate
memory systems and other peripherals. Mass storage
systems 112 and 113 are connected to computer 111,
and mass storage systems 122 and 123 are connected
to computer 121. Mass storage systems 112 and 123
are optional devices for storing operating system
routines and other data not associated with read
and write requests received from network 101.
Finally, there is an optional communications link
131 between computers 111 and 121.

The mass storage systems can be
implemented using magnetic disk drives, optical
discs, magnetic tape drives, or any other medium
capable of handling the read and write requests of
the particular computer system.

An operating system or other control
program runs on server computer systems 110 and
120, executed by computers 111 and 121,
respectively. This operating system handles server
requests received from network 101 and controls
mass storage systems 112 and 113 on server 110, and
mass storage systems 122 and 123 on server 120, as
well as any other peripherals attached to computers
111 and 121.

While Figure 1 illustrates only two
server computer systems 110 and 120, because that
is the most common (and lowest cost) configuration
for a fault-tolerant computer system 100,
configurations with more than two server computer
systems are possible and do not depart from the
spirit and scope of the present invention.

In normal operation, both server computer
system 110 and server computer system 120 handle
each mass storage write request received from
network 101. Server computer system 110 writes the
data from the network request to mass storage
system 113, and server computer system 120 writes
the data from the network request to mass storage
system 122. This results in the data on mass
storage system 122 being the mirror image of the
data on mass storage system 113, and the states of
server computer systems 110 and 120 are generally
consistent. In the following discussion, the
process of maintaining two or more identical copies
of information on separate mass storage systems is
referred to as "mirroring the information".
(For read operations, either server
computer system 110 or server computer system 120
can handle the request without involving the other
server, since a read operation does not change the
state of the information stored on the mass storage
systems.)
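As a rough illustration of mirroring as defined above, the following Python sketch models each mass storage system as a table of numbered blocks. The names `MassStorage`, `mirrored_write`, and `read_any` are hypothetical and do not appear in the specification; in the system of Figure 1 the two writes are issued by two different computers, which this single-process sketch does not attempt to model.

```python
class MassStorage:
    """Toy stand-in for a mass storage system: block number -> bytes."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, block, data):
        self.blocks[block] = data

    def read(self, block):
        return self.blocks.get(block)


def mirrored_write(stores, block, data):
    """Apply one write request to every mirror so all copies stay identical."""
    for store in stores:
        store.write(block, data)


def read_any(stores, block):
    """A read changes no stored state, so any single mirror can serve it."""
    return stores[0].read(block)


storage_113 = MassStorage("113")
storage_122 = MassStorage("122")
mirrored_write([storage_113, storage_122], 7, b"payroll")
```

After the write, either store can answer a read for block 7 on its own, which is why read requests need not involve the other server.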

Although computer system 100 provides a
substantial degree of fault tolerance, when one of
server computer systems 110 or 120 fails, the fault
tolerance of the system is reduced. In the most
common case of two server computer systems, as
illustrated by Figure 1, the failure of one server
computer system results in a system with no further
tolerance to hardware faults or many software
faults.

In a fault-tolerant computer system such
as described above, it is necessary after a failed
server computer system has been restored to bring
the previously-failed computer system into a state
consistent with the server computer system that has
continued operating. This requires writing all the
changes made to the mass storage system of the
non-failing server to the mass storage system of the
previously-failed server so that the mass storage
systems again mirror each other. Until that has
been accomplished, the system is not fault tolerant
even though the failed server has been restored.

If a server has been unavailable due to
its failure for a period of time during which there
have been only a limited number of changes made to
the mass storage system of the non-failing server,
it is possible for the non-failing server to
remember all the changes made (for example, by
keeping them in a list stored in its memory) and
forward the changes to the previously-failed server
when it has been restored to operation. The
previously-failed server can then update its mass
storage system with the changes and make it
consistent with the non-failing server. This
process typically does not cause excessive
performance degradation to the non-failing server
for any substantial period of time.

However, if there have been more changes
than can be conveniently remembered by the
non-failing server, then the non-failing server must
transfer all the information from its mass storage
system to the previously-failed server for writing
on its mass storage system in order to ensure that
the two servers are consistent. This is a very
time-consuming and resource-intensive operation,
especially if the non-failing server must also
handle server requests from the network while this
transfer is taking place. For very large mass
storage systems, as would be found on servers
commonly in use today, and with a reasonably high
network request load, it might be many hours before
the mass storage systems are again consistent and
the system is again fault tolerant. Additionally,
the resource-intensiveness of the recovery
operation can cause very substantial performance
degradation of the non-failed server in processing
network requests.
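The two prior art recovery paths described above (replaying a remembered list of changes, or transferring everything when the list has grown too large) can be sketched as follows. This is only an illustrative model under stated assumptions: the `ChangeLog` class, its `capacity` parameter, and the `resynchronize` function are hypothetical names, and storage is again modeled as a dictionary of blocks.

```python
class ChangeLog:
    """Remembers writes missed by a failed peer, up to a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.changes = {}          # block number -> latest data
        self.overflowed = False

    def record(self, block, data):
        if self.overflowed:
            return
        if block not in self.changes and len(self.changes) >= self.capacity:
            # Too many distinct changes to remember: abandon the list.
            self.overflowed = True
            self.changes.clear()
            return
        self.changes[block] = data


def resynchronize(source_blocks, target_blocks, log):
    """Make target match source; return the number of blocks written."""
    if log.overflowed:
        # Slow path: transfer all the information.
        target_blocks.clear()
        target_blocks.update(source_blocks)
        return len(source_blocks)
    # Fast path: forward only the remembered changes.
    target_blocks.update(log.changes)
    return len(log.changes)


source = {n: b"old" for n in range(1000)}
target = dict(source)
log = ChangeLog(capacity=100)
for n in (3, 4):
    source[n] = b"new"
    log.record(n, b"new")
written = resynchronize(source, target, log)
```

In this example only two blocks are written to bring the target up to date; had the log overflowed, all 1000 blocks would have been transferred.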

Summary of the Invention

It is an object of the present invention
to provide tolerance to disk faults even though the
computer of a server computer system has failed.
This is achieved by electronically switching the
mass storage system used for network requests from
the failed server computer system to the
non-failing server computer system. After the mass
storage system from the failed server computer
system has been connected to the non-failing
server's computer, it is made consistent with the
mass storage system of the non-failing server.
This is typically a quick and simple operation.
From that point on, the mass storage system from
the failed server is operated as a mirrored disk
system, with each change being written by the
non-failing server's computer to both the non-failing
server's original mass storage system and to the
mass storage system previously on the failed server
and now connected to the non-failing server's
computer.

While operating in this mode, the system
will no longer be tolerant to processor failures if
the non-failing server is the only remaining server
(as would be the case in the common two-server
configuration described above), but the system
would be tolerant to failures of one of the mass
storage systems.

It is a further object of the present
invention to minimize the time the system is not
fault tolerant by eliminating the need for
time-consuming copying of the information stored on the
mass storage system of the non-failing server to
the mass storage of the previously-failed server to
make the two mass storage systems again consistent
and permit mirroring of information again.

This is also achieved by electronically
switching the mass storage system from the failed
server computer system to the non-failing server
computer system. If this switch is accomplished
after there have been only a small number of
changes to the mass storage system of the
non-failing server, the mass storage system from the
failed server computer system can be quickly
updated and made consistent, allowing mirroring to
resume.

Furthermore, since the mirroring of the
invention keeps the information on the mass storage
system from the failed server consistent while it



WO 95/00906



PCT/US94/07009 



is connected to the non-failing server computer
system, when the mass storage system is reconnected
to the previously-failed server only those changes
made between the time it was disconnected from the
non-failed server and when it becomes available on
the previously-failed server need to be made before
it is again completely consistent and mirroring by
the two servers (and full fault tolerance) resumes.
This results in avoiding the substantial
performance degradation experienced by the
non-failing server during recovery using the prior art
recovery method described above. As a result, the
invention provides rapid recovery from a fault in
the system.

These and other features of the invention
will be more readily understood upon consideration
of the attached drawings and of the following
detailed description of those drawings and the
presently preferred embodiments of the invention.

Brief Description of the Drawings

Figure 1 illustrates a prior art
implementation of a fault-tolerant computer system
with two server computer systems.

Figure 2 illustrates the fault-tolerant
computer system of Figure 1, modified to permit the
method of the invention by including means for
connecting a mass storage system to either server's
computer.

Figure 3 is a flow diagram illustrating
the steps to be taken when a processor failure is
detected.

Figure 4 is a flow diagram illustrating
the steps to be taken when the previously-failed
processor becomes available.

Detailed Description of the Invention

Referring to fault-tolerant computer
system 200 of Figure 2, and comparing it to prior
art fault-tolerant computer system 100 as
illustrated in Figure 1, we see that mass storage
systems 113 and 122, which were used for storing
the information read or written in response to
requests from other computer systems on network
101, are now part of reconfigurable mass storage
system 240. In particular, mass storage system 113
can be selectively connected by connection means
241 to either computer 111 or computer 121 (or
possibly both computers 111 and 121, although such
dual connection is not necessary for the present
invention), and mass storage system 122 can
likewise be independently selectively connected to
either computer 111 or computer 121 by connection
means 241. The mass storage system 240 is
reconfigurable because of the ability to select and
change connections between mass storage devices and
computers.
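One way to picture reconfigurable mass storage system 240 is as a table recording which computers each mass storage system is currently connected to. The sketch below is a minimal model under that assumption; the class `ConnectionMeans` and its methods are hypothetical names chosen for illustration, not an interface defined by the specification.

```python
class ConnectionMeans:
    """Tracks which computers each mass storage system is connected to."""

    def __init__(self):
        self.links = {}   # storage id -> set of computer ids

    def connect(self, storage, computer):
        self.links.setdefault(storage, set()).add(computer)

    def disconnect(self, storage, computer):
        self.links.get(storage, set()).discard(computer)

    def switch(self, storage, old_computer, new_computer):
        """Move one storage system from one computer to another."""
        self.disconnect(storage, old_computer)
        self.connect(storage, new_computer)

    def storages_of(self, computer):
        return sorted(s for s, cs in self.links.items() if computer in cs)


means_241 = ConnectionMeans()
means_241.connect(113, 111)      # initial configuration, as in Figure 1
means_241.connect(122, 121)
means_241.switch(122, 121, 111)  # storage 122 now served by computer 111
```

A set of computers per storage system is used so that the optional dual connection mentioned above can also be represented.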

While Figure 2 illustrates the most
common dual server configuration anticipated by the
inventors, other configurations with more than two
servers are within the scope of the present
invention, and the extension of the techniques
described below to other configurations will be
obvious to one skilled in the art.

There are a number of ways such
connection means 241 can be implemented, depending
on the nature of the mass storage system interface
to computers 111 or 121. Connection means 241 can
be two independent two-channel switches, which
electronically connect all the interface signals
from a mass storage system to two computers. Such
two-channel switches may be a part of the mass
storage system (as is common for mass storage
systems intended for use with mainframe computers)
or can be a separate unit. A disadvantage of using
two-channel switches is the large number of
switching gates that are necessary if the number of
data and control lines in the mass storage
interface is large. That number increases rapidly
when there are more than two server computer
systems in fault-tolerant computer system 200. For
example, a fault-tolerant computer system with
three computers connected to three mass storage
systems would require 2.25 times the number of
switching gates as the system illustrated in Figure
2. (The number of switching gates is proportional
to the number of computers times the number of mass
storage systems.) The number of switching gates
can be reduced by not connecting every mass storage
system to every computer, although such a
configuration would be less flexible in its
reconfiguration ability.
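The factor of 2.25 follows directly from the stated proportionality: three computers times three mass storage systems gives 9, against 2 times 2 = 4 for Figure 2. A short check, assuming full connectivity and an identical interface width in both systems (the function name `relative_gate_count` is illustrative only):

```python
from fractions import Fraction


def relative_gate_count(computers, storage_systems, base=(2, 2)):
    """Gate count relative to a baseline system, assuming every mass
    storage system is switched to every computer."""
    return Fraction(computers * storage_systems, base[0] * base[1])


ratio = relative_gate_count(3, 3)   # Fraction(9, 4)
```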

Another implementation of connection
means 241 is for both computer 111 and computer 121
to have interfaces to a common bus to which mass
storage systems 113 and 122 are also connected. An
example of such a bus is the small computer system
interface (SCSI) as used on many workstations and
personal computers. When a computer wishes to
access a mass storage system, the computer requests
ownership of the bus through an appropriate bus
arbitration procedure, and when ownership is
granted, the computer performs the desired mass
storage operation. A disadvantage of this
implementation is that only one computer (the one
with current bus ownership) can access a mass
storage system at a time.
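The single-owner restriction can be illustrated with a simple lock. This sketch is a deliberate simplification and makes no attempt to reproduce actual SCSI arbitration, which is priority-based; the `SharedBus` class and its methods are hypothetical.

```python
import threading


class SharedBus:
    """A bus that grants ownership to at most one computer at a time."""

    def __init__(self):
        self._owner_lock = threading.Lock()
        self.owner = None

    def request_ownership(self, computer):
        """Non-blocking arbitration: return True if the bus was granted."""
        if self._owner_lock.acquire(blocking=False):
            self.owner = computer
            return True
        return False

    def release(self, computer):
        if self.owner != computer:
            raise RuntimeError("only the current owner may release the bus")
        self.owner = None
        self._owner_lock.release()


bus = SharedBus()
granted_111 = bus.request_ownership(111)
granted_121 = bus.request_ownership(121)   # refused while 111 owns the bus
bus.release(111)
granted_121_retry = bus.request_ownership(121)
```

While computer 111 owns the bus, computer 121 cannot reach either mass storage system, which is the disadvantage noted above.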

If it is desirable to use a standard SCSI
bus as means 241 for connecting mass storage
systems 113 and 122 to computers 111 and 121, and
to allow simultaneous access of the mass storage
systems 113 and 122 by their respective server's
computers, computers 111 and 121 can each have two
SCSI interfaces, one connected to mass storage
system 113 and one connected to mass storage system
122. Mass storage system 113 will be on a SCSI bus
connected to both computers 111 and 121, and mass
storage system 122 will be on a second SCSI bus,
also connected to both computers 111 and 121. If
computer 111 or computer 121 is not using a
particular mass storage system, it will configure
its SCSI interface to be inactive on that mass
storage system's particular bus.

In the preferred embodiment, a high-speed
serial network between computers 111 and 121 and
mass storage systems 113 and 122 forms connection
means 241. Each of computers 111 and 121 contains
an interface to the network, and requests to a mass
storage system 113 or 122 are routed to the appropriate
network interface serving the particular mass
storage system. Although a bus-type network, such
as an Ethernet, could be used, the network of the
preferred embodiment has network nodes at each
computer and at each mass storage system. Each
node can be connected to up to four other network
nodes. A message is routed by each network node to
a next network node closer to the message's final
destination.

For the fault-tolerant computer system
configuration of Figure 2, one network connection
from the node at computer 111 is connected to the
node for mass storage system 113, and another
network connection from the node at computer 111 is
connected to the node for mass storage system 122.
Similar connections are used for computer 121.
Mass storage system 113's node is connected
directly to the nodes for computers 111 and 121,
and mass storage system 122's node is similarly
connected (but with different links) to computers
111 and 121. Routing of messages is trivial, since
there is only one link between each computer and
each mass storage system.
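The node-to-node routing described above can be sketched for the Figure 2 topology as follows. The node labels, the `LINKS` table, and the `next_hop` and `route` functions are assumptions made for illustration; the specification does not state how a node decides which neighbor is closer to the destination, so a breadth-first search is used here as one plausible choice.

```python
from collections import deque

# Figure 2 topology: each computer node links directly to each storage node.
LINKS = {
    "computer 111": ["storage 113", "storage 122"],
    "computer 121": ["storage 113", "storage 122"],
    "storage 113": ["computer 111", "computer 121"],
    "storage 122": ["computer 111", "computer 121"],
}


def next_hop(links, here, destination):
    """Return the neighbor of `here` lying on a shortest path to `destination`."""
    if here == destination:
        return here
    first_step = {n: n for n in links[here]}
    queue = deque(links[here])
    while queue:
        node = queue.popleft()
        if node == destination:
            return first_step[node]
        for neighbor in links[node]:
            if neighbor not in first_step and neighbor != here:
                first_step[neighbor] = first_step[node]
                queue.append(neighbor)
    return None


def route(links, source, destination):
    """Forward hop by hop, each node handing off to its next hop."""
    path = [source]
    while path[-1] != destination:
        hop = next_hop(links, path[-1], destination)
        if hop is None:
            raise ValueError("destination unreachable")
        path.append(hop)
    return path


direct = route(LINKS, "computer 111", "storage 122")
```

Because every computer has its own link to every mass storage system, each such route is a single hop, consistent with the remark that routing is trivial in this configuration.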

The particular connecting means 241 used
to connect computers 111 and 121 to mass storage
systems 113 and 122 is not critical to the method
of the present invention, so long as it provides
for the rapid switching of a mass storage system
from one computer to another without affecting the
operation of the computers. Any such means for
connecting a mass storage system to two or more
computers is usable by the method of the present
invention.

The method of the present invention is
divided into two portions, a first portion for
reacting to a processor failure and a second
portion for recovering from a processor failure.
The first portion of the method of the present
invention is illustrated by Figure 3, which is a
flow diagram illustrating the steps to be taken
when a processor failure is detected. The
description of the method provided below should be
read in light of Figure 2. For purposes of
illustration, it will be assumed that connection
means 241 initially connects mass storage system
113 to computer 111 and mass storage system 122 to
computer 121, providing an equivalent to the
configuration illustrated in Figure 1, although the
connection means 241 of Figure 2 facilitates this
equivalent configuration. Information mirroring as
described above is being performed by computers 111
and 121. It is also assumed that computer 121 has
experienced a fault, causing server computer system
120 to fail.

The method starts in step 301, with each
of computers 111 and 121 waiting to detect a failure
of the other server's computer. Such
failure can be detected by probing the status of
the other server's computer by a means appropriate
to the particular operating system being used and
the communications methods between the servers. In
the case of Novell's SFT-III, the method will be
running as a NetWare Loadable Module, or NLM, and
be capable of communicating directly with the
operating system by means of requests. The NLM
will make a null request to the SFT-III process.
This null request will be such that it will never
normally run to completion, but will remain in the
SFT-III process queue. (It will require minimal
resources while it remains in the process queue.)
In the event of a failure of server computer system
120, SFT-III running on server computer system 110
will indicate the failure of the null request to
the NLM of the method, indicating the failure of
server 120. Because a processor failure has been
detected, the method depicted in Figure 3 proceeds
to step 302.
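The null-request technique of step 301 resembles waiting on a request that stays pending indefinitely and is failed only when the peer is lost. The sketch below models that pattern with a Python `Future`; it does not use or imitate any actual NetWare or SFT-III interface, and `OperatingSystemStub`, `PeerFailed`, and `on_done` are hypothetical names.

```python
from concurrent.futures import Future


class PeerFailed(Exception):
    """Set on the pending request when the peer server is lost."""


class OperatingSystemStub:
    """Hypothetical stand-in for the operating system's request queue."""

    def __init__(self):
        self._pending = []

    def submit_null_request(self):
        request = Future()           # stays pending during normal operation
        self._pending.append(request)
        return request

    def report_peer_failure(self, peer):
        # Fail every queued null request to signal the loss of the peer.
        for request in self._pending:
            request.set_exception(PeerFailed(peer))
        self._pending.clear()


detected = []


def on_done(request):
    error = request.exception()
    if isinstance(error, PeerFailed):
        detected.append(error.args[0])   # the method would go to step 302


os_stub = OperatingSystemStub()
null_request = os_stub.submit_null_request()
null_request.add_done_callback(on_done)
still_pending = not null_request.done()
os_stub.report_peer_failure("server 120")
```

The request consumes almost nothing while it waits, and its failure is the only event the monitoring module has to react to.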

In step 302, detection of the failure of
server 120 causes the discontinuation of mirroring
information on the failed server 120. This
discontinuation can either be done automatically by
the operating system upon its detection of the
failure of server 120, or by the particular
implementation of the preferred embodiment of the
method of the present invention. In the case of
SFT-III, the discontinuation of mirroring on server
120 is performed by the SFT-III operating system.
Step 303 of the method is performed next.

In step 303, SFT-III remembers all data
not mirrored on server 120 following its failure as
long as the amount of data to be remembered does
not exceed the capacity of the system resource
remembering the data. If the particular operating
system does not remember non-mirrored data, step
303 would have to be performed by the particular
implementation of the method of the present
invention. The step of remembering all non-mirrored
data could be performed by any technique
known to persons skilled in the art.

Next, step 304 of the method sets
connection means 241 to disconnect mass storage
system 122 from computer 121 of failed server 120,
and to connect it to computer 111 of non-failing
server 110. At this point, the method can quickly
test mass storage system 122 to determine if it is
the cause of the failure of server 120. If it is,
there is no fault-tolerance recovery possible using
the method, and mass storage system 122 can be
disconnected from computer 111 at connection means
241. If mass storage system 122 is not the cause
of server 120's failure, then the cause must be
computer 121, and the method can continue to
achieve limited fault tolerance in the presence of
the computer 121's failure.

Step 305 commands the operating system of
server 110 to scan for new mass storage systems,
causing the operating system to determine that mass
storage system 122 is now connected to computer
111, along with mass storage system 113. SFT-III
will detect through information on mass storage
systems 113 and 122 that they contain similar
information, but that mass storage system 122 is
not consistent with mass storage system 113. In
step 306, SFT-III will update mass storage system
122 using the information remembered at step 303
and, after the two mass storage systems are
consistent (i.e., contain identical mirrored copies
of the stored information), step 307 will begin
mirroring all information on both mass storage
systems 113 and 122 and resume normal operation of
the system. If an operating system different than
SFT-III does not provide this automatic update for
consistency and mirroring, the implementation of
the method will have to provide an equivalent
service.

Note that when SFT-III is used, the only
steps of the method that must be performed by the
NETWARE loadable module are: (1) detecting the
failure of server 120 (step 301), (2) setting
connection means 241 to disconnect mass storage
system 122 from computer 121 and connecting it to
computer 111 (step 304), (3) determining if mass
storage system 122 was the cause of the failure of
server 120 (also part of step 304), and (4)
commanding SFT-III to scan for mass storage systems
so that it finds the newly-connected mass storage
system 122 (step 305). All the other steps are
performed as part of the standard facilities of
SFT-III. In other embodiments of the invention,
responsibility for performing the steps of the
method may be allocated differently.
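The first portion of the method, from the switch of step 304 through the resumption of mirroring in step 307, can be summarized in a short sketch. It is an illustrative model only: storage is represented as dictionaries of blocks, the connection means as a table from storage number to computer number, and the function `fail_over` and its parameters (including the `disk_is_healthy` test) are hypothetical.

```python
def fail_over(connections, survivor, storage, disk_is_healthy,
              primary_blocks, switched_blocks, remembered):
    """Run steps 304-307 after a failure has been detected (step 301).

    Returns True if mirroring resumed on the surviving computer, or
    False if the switched storage was itself the cause of the failure.
    """
    # Step 304: move the failed server's storage to the surviving computer.
    connections[storage] = survivor
    if not disk_is_healthy(storage):
        # The storage caused the failure: no recovery using this method.
        connections[storage] = None
        return False
    # Steps 305-306: after a rescan, replay the writes remembered in step 303.
    switched_blocks.update(remembered)
    remembered.clear()
    # Step 307: the copies are consistent again, so mirroring can resume.
    return switched_blocks == primary_blocks


connections = {113: 111, 122: 121}
blocks_113 = {1: b"a", 2: b"b2", 3: b"c"}
blocks_122 = {1: b"a", 2: b"b"}            # stale since the failure
remembered = {2: b"b2", 3: b"c"}           # kept by step 303
resumed = fail_over(connections, 111, 122, lambda s: True,
                    blocks_113, blocks_122, remembered)
```

Only the remembered writes are replayed, which is why the update is quick when the switch happens soon after the failure.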

Figure 4 is a flow diagram illustrating
the second portion of the invention - the steps to
be taken when previously-failed server 120 becomes
available again. Server 120 would typically become
available after correction of the problem that
caused its failure described above. Step 401
determines that server 120 is available and the
method proceeds to step 402. In step 402, the
method sets connection means 241 to disconnect mass
storage system 122 from computer 111 after
commanding SFT-III on server 110 to remove mass
storage system 122 from its active mass storage
systems. Due to the unavailability of mass storage
system 122 on server 110, data mirroring on server
110 will be stopped by SFT-III and it will begin
remembering changes to mass storage system 113 not
made to mass storage system 122 to be used in
making the storage systems consistent later.

In step 403, mass storage system 122 is
reconnected to computer 121, and in step 404,
SFT-III on server 120 is commanded to scan for the
newly-connected mass storage system 122. This
returns mass storage system 122 to the computer 121
to which it was originally connected prior to a
server failure. When SFT-III on server 120 detects
mass storage system 122, it communicates with
server 110 over link 131. At this point, the
operating systems on servers 110 and 120 work
together to make mass storage system 122 again
consistent with mass storage system 113 (i.e., by
remembering interim changes to mass storage system
113 and writing them to mass storage system 122),
and when consistency is achieved, data mirroring on
the two servers resumes. At this point, recovery
from the server failure is complete.

In an SFT-III system, the only steps of
the method that the NetWare Loadable Module must
perform are: (1) detecting the availability of
server 120 (step 401), (2) removing mass storage
system 122 from the operating system on server 110
(step 402), (3) disconnecting mass storage system
122 from computer 111 and connecting it to computer
121 by setting connection means 241 (step 403), and
(4) commanding SFT-III on server 120 to scan for
mass storage so that it locates mass storage system
122 (step 404). The steps involved with making
mass storage systems 113 and 122 consistent and
reestablishing data mirroring (step 405) are
performed as part of the standard facilities of
SFT-III. In other embodiments of the invention,
responsibility for the steps of the method may be
allocated differently.
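The second portion of the method can be sketched in the same style. As before this is a simplified model under stated assumptions: `restore` and its parameters are hypothetical, `interim_writes` stands in for the changes remembered by server 110 between steps 402 and 405, and the cooperation of the two operating systems over link 131 is reduced to a dictionary update.

```python
def restore(connections, storage, survivor, restored,
            switched_blocks, interim_writes):
    """Run steps 402-405 once the restored server is detected (step 401).

    Returns the number of blocks replayed to regain consistency.
    """
    # Step 402: the storage must currently be served by the survivor.
    if connections[storage] != survivor:
        raise ValueError("storage is not attached to the surviving computer")
    # Step 403: switch the storage back to its original computer.
    connections[storage] = restored
    # Steps 404-405: after the rescan, replay only the interim writes.
    switched_blocks.update(interim_writes)
    replayed = len(interim_writes)
    interim_writes.clear()
    return replayed


connections = {113: 111, 122: 111}
blocks_113 = {n: b"x" for n in range(500)}
blocks_122 = dict(blocks_113)     # kept consistent while on computer 111
blocks_113[42] = b"y"             # written during the switch-back window
replayed = restore(connections, 122, 111, 121, blocks_122, {42: b"y"})
```

Because mass storage system 122 was mirrored while attached to computer 111, a single block is replayed here rather than all 500.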

Figure 2 illustrates optional mass
storage systems 112 and 123 attached to computers
111 and 121, respectively. While these two mass
storage systems are not required by the method of
the present invention, they are useful during the
restoration of a failed server. They provide
storage for the operating system and other
information needed by failed server 120 to begin
operation before mass storage system 122 is
switched from computer 111 to computer 121. Were
mass storage system 123 not available, some means
of having mass storage system 122 connected both to
computer 121 (for initializing its operation
following correction of its failure) and computer
111 (for continued disk mirroring) would be
necessary. Alternatively, if the initialization
time of server 120 is short, mass storage system
122 could be switched from computer 111 to computer
121 at the start of server 120's initialization,
though this would result in more changes that must
be remembered and made before data mirroring can
begin again.

It is to be understood that the above
described embodiments are merely illustrative of
numerous and varied other embodiments which may
constitute applications of the principles of the
invention. Such other embodiments may be readily
devised by those skilled in the art without
departing from the spirit or scope of this
invention and it is our intent they be deemed
within the scope of our invention.

Claims

We claim:

1. A method for rapid failure recovery and
system restoration in a fault-tolerant computer
system, said computer system comprising:
    (A) a first server computer system,
    comprising a first computer executing an
    operating system;
    (B) a second server computer system,
    comprising a second computer executing an
    operating system;
    (C) a first mass storage system connected to
    said first computer;
    (D) a second mass storage system; and
    (E) means for connecting said second mass
    storage system to said first computer and to
    said second computer;
WHEREIN whenever said first computer writes
data to said first mass storage system, said second
computer writes a mirror copy of said data to said
second mass storage system,

the method comprising the steps of:
    (1) detecting a failure of said second
    computer;
    (2) discontinuing causing said writing of
    said mirror copy on said second mass storage
    system;
    (3) remembering data written to said first
    mass storage system but not written to said
    second mass storage system;
    (4) configuring said second mass storage
    system to record information from said first
    computer;
    (5) writing said remembered data to said
    second mass storage system;
    (6) whenever new data is written to said
    first mass storage system, writing a mirror
    copy of said new data to said second mass
    storage system;
    (7) detecting said second computer's
    availability;
    (8) reconfiguring said second mass storage
    system to record information from said second
    computer;
    (9) reestablishing data mirroring such that
    whenever said first computer writes data to
    said first mass storage system, said second
    computer writes a mirror copy of said data on
    said second mass storage system.

2. A method as in claim 1 wherein step (1) is performed by said first
computer.

3. A method as in claim 2 wherein step (2) is performed by said first
computer.

4. A method as in claim 1 wherein step (3) is performed by said first
computer.

5. A method as in claim 4 wherein step (5) is performed by said first
computer.

6. A method as in claim 5 wherein step (6) is performed by said first
computer.

7. A method as in claim 1, wherein said first mass storage system and
said second mass storage system each comprise at least one magnetic
disk drive.

8. A method as in claim 1, wherein said means for connecting said
second mass storage system comprises a serial network.

9. A method as in claim 1 wherein said operating systems are the
SFT-III operating system.

10. A method as in claim 9 wherein steps (1), (4) and (5) are performed
by a NETWARE loadable module.

11. A method for rapid failure recovery and system restoration in a
fault-tolerant computer system, said computer system comprising:
    (A) a first server computer system, comprising a first computer
    executing an operating system;
    (B) a second server computer system, comprising a second computer
    executing an operating system;
    (C) a first mass storage system connected to said first computer;
    (D) a second mass storage system; and
    (E) means for selectively connecting said second mass storage
    system to said first computer and to said second computer;
WHEREIN in the absence of a fault said second mass storage system is
connected to said second computer; and
WHEREIN whenever said first computer writes data to said first mass
storage system said first computer can also cause said second computer
to write a mirror copy of said data to said second mass storage system,
the method of the invention comprising:
    (1) on said first computer, detecting a failure of said second
    computer;
    (2) on said first computer, discontinuing causing said writing of
    said mirror copy on said second mass storage system by said second
    computer;
    (3) on said first computer, remembering data written to said first
    mass storage system but not written to said second mass storage
    system;
    (4) on said first computer, setting said means for connecting said
    second mass storage system to connect said second mass storage
    system to said first computer;
    (5) on said first computer, commanding said operating system of
    said first computer to scan for mass storage systems such that said
    operating system of said first computer will determine that both
    said first mass storage system and said second mass storage system
    are now connected to said first computer;
    (6) on said first computer, writing said remembered data to said
    second mass storage system;
    (7) on said first computer, whenever new data is written to said
    first mass storage system, writing a mirror copy of said new data
    to said second mass storage system;
    (8) on said first computer, detecting said second computer's
    availability;
    (9) on said first computer, commanding said operating system of
    said first computer to remove said second mass storage system;
    (10) setting said means for connecting said second mass storage
    system to connect said second mass storage system to said second
    computer;
    (11) on said second computer, commanding said operating system of
    said second computer to scan for mass storage systems such that
    said operating system of said second computer will determine that
    said second mass storage system is now connected to said second
    computer;
    (12) reestablishing data mirroring such that whenever said first
    computer writes data to said first mass storage system said first
    computer also causes said second computer to write a mirror copy of
    said data on said second mass storage system.

12. A method as in claim 11, wherein said first mass storage system and
said second mass storage system each comprise at least one magnetic
disk drive.

13. A method as in claim 12, wherein said means for connecting said
second mass storage system comprises a serial network.

14. A method for rapid failure recovery in a fault-tolerant computer
system, said computer system comprising:
    (A) a first server computer system, comprising a first computer
    executing an operating system;
    (B) a second server computer system, comprising a second computer;
    (C) a first mass storage system connected to said first computer;
    (D) a second mass storage system; and
    (E) means for selectively connecting said second mass storage
    system to said first computer and to said second computer;
WHEREIN in the absence of a fault said second mass storage system is
connected to said second computer; and
WHEREIN whenever said first computer writes data to said first mass
storage system said first computer can also cause said second computer
to write a mirror copy of said data on said second mass storage system,
the method of the invention comprising said first computer performing
the steps of:
    (1) detecting a failure of said second computer;
    (2) discontinuing causing said writing of said mirror copy on said
    second mass storage system by said second computer;
    (3) remembering data written to said first mass storage system but
    not written to said second mass storage system;
    (4) setting said means for connecting said second mass storage
    system to connect said second mass storage system to said first
    computer;
    (5) commanding said operating system of said first computer to scan
    for mass storage systems such that said operating system of said
    first computer will determine that both said first mass storage
    system and said second mass storage system are now connected to
    said first computer;
    (6) writing said remembered data to said second mass storage
    system;
    (7) whenever new data is written to said first mass storage system,
    writing a mirror copy of said new data to said second mass storage
    system.

15. A method as in claim 14, wherein said first mass storage system and
said second mass storage system each comprise at least one magnetic
disk drive.

16. A method as in claim 15, wherein said means for connecting said
second mass storage system comprises a serial network.

17. A method for system restoration in a fault-tolerant computer
system, said computer system comprising:
    (A) a first server computer system, comprising a first computer
    executing an operating system;
    (B) a second server computer system, comprising a second computer
    executing an operating system;
    (C) a first mass storage system connected to said first computer;
    (D) a second mass storage system; and
    (E) means for connecting said second mass storage system to said
    first computer and to said second computer;
WHEREIN said second computer is initially unavailable for use, and
WHEREIN said second mass storage system is initially connected to said
first computer,
the method comprising:
    (1) on said first computer, detecting said second computer's
    availability;
    (2) on said first computer, commanding said operating system of
    said first computer to remove said second mass storage system;
    (3) setting said means for connecting said second mass storage
    system to connect said second mass storage system to said second
    computer;
    (4) on said second computer, commanding said operating system of
    said second computer to scan for mass storage systems such that
    said operating system of said second computer will determine that
    said second mass storage system is now connected to said second
    computer;
    (5) reestablishing data mirroring such that whenever said first
    computer writes data to said first mass storage system said first
    computer also causes said second computer to write a mirror copy of
    said data on said second mass storage system.

18. A method as in claim 17, wherein said first mass storage system and
said second mass storage system each comprise at least one magnetic
disk drive.

19. A method as in claim 18, wherein said means for connecting said
second mass storage system comprises a serial network.

20. A method as in claim 17 wherein said operating system is the
SFT-III operating system.

21. A method as in claim 20 wherein steps (1), (4) and (5) are
performed by a NETWARE loadable module.

22. A method for rapid failure recovery in a fault-tolerant computer
system, said computer system comprising:
    (A) a first server computer system, comprising a first computer
    executing an operating system;
    (B) a second server computer system, comprising a second computer
    executing an operating system;
    (C) a first mass storage system connected to said first computer;
    (D) a second mass storage system; and
    (E) means for connecting said second mass storage system to said
    first computer and to said second computer;
WHEREIN whenever said first computer writes data to said first mass
storage system, said second computer writes a mirror copy of said data
to said second mass storage system,
the method comprising the steps of:
    (1) detecting a failure of said second computer;
    (2) discontinuing causing said writing of said mirror copy on said
    second mass storage system;
    (3) remembering data written to said first mass storage system but
    not written to said second mass storage system;
    (4) configuring said second mass storage system to record
    information from said first computer;
    (5) writing said remembered data to said second mass storage
    system; and
    (6) whenever new data is written to said first mass storage system,
    writing a mirror copy of said new data to said second mass storage
    system.

23. A method for system restoration in a fault-tolerant computer
system, said computer system comprising:
    (A) a first server computer system, comprising a first computer
    executing an operating system;
    (B) a second server computer system, comprising a second computer
    executing an operating system;
    (C) a first mass storage system connected to said first computer;
    (D) a second mass storage system;
    (E) means for connecting said second mass storage system to said
    first computer and to said second computer;
WHEREIN said second computer is initially unavailable for use; and
WHEREIN said second mass storage system is initially configured to
record information from said first computer,
the method comprising the steps of:
    (1) detecting said second computer's availability;
    (2) reconfiguring said second mass storage system to record
    information from said second computer;
    (3) establishing data mirroring such that whenever said first
    computer writes data to said first mass storage system, said second
    computer writes a mirror copy of said data on said second mass
    storage system.

24. A method for rapid failure recovery and system restoration in a
fault-tolerant computer system, the method comprising the steps of:
    (1) obtaining a computer system, the computer system comprising:
        (A) a first server computer system, comprising a first computer
        executing an operating system;
        (B) a second server computer system, comprising a second
        computer executing an operating system;
        (C) a first mass storage system connected to said first
        computer;
        (D) a second mass storage system; and
        (E) means for connecting said second mass storage system to
        said first computer and to said second computer;
    (2) operating said computer system such that absent a fault,
    whenever said first computer writes data to said first mass storage
    system, said second computer writes a mirror copy of said data to
    said second mass storage system;
    (3) detecting a failure of said second computer;
    (4) discontinuing causing said writing of said mirror copy on said
    second mass storage system;
    (5) remembering data written to said first mass storage system but
    not written to said second mass storage system;
    (6) configuring said second mass storage system to record
    information from said first computer;
    (7) writing said remembered data to said second mass storage
    system;
    (8) whenever new data is written to said first mass storage system,
    writing a mirror copy of said new data to said second mass storage
    system;
    (9) detecting said second computer's availability;
    (10) reconfiguring said second mass storage system to record
    information from said second computer;
    (11) reestablishing data mirroring such that whenever said first
    computer writes data to said first mass storage system, said second
    computer writes a mirror copy of said data on said second mass
    storage system.

      DISCONTINUE MIRROR COPY ON OTHER COMPUTER SYSTEM
303   REMEMBER DATA NOT MIRRORED ON OTHER COMPUTER SYSTEM
304   DISCONNECT MASS STORAGE FROM OTHER PROCESSOR AND CONNECT
      TO THIS PROCESSOR
305   COMMAND OPERATING SYSTEM TO SCAN FOR NEWLY-CONNECTED MASS
      STORAGE FROM OTHER PROCESSOR
306   WRITE NON-MIRRORED DATA ON NEWLY-CONNECTED MASS STORAGE
      BEGIN SINGLE PROCESSOR OPERATION, MIRRORING ALL DATA

FIG. 3
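
The failure-recovery flow of FIG. 3 can be illustrated with a minimal Python sketch. It is not the source code of the Microfiche Appendix. The class MirroredPair, its attribute names, and the use of dictionaries to stand in for the mass storage systems are all assumptions made for illustration only.

```python
class MirroredPair:
    """Toy model of the two servers; dicts stand in for the mass storage systems."""

    def __init__(self):
        self.disk1 = {}                    # first mass storage system
        self.disk2 = {}                    # second mass storage system
        self.disk2_attached_to = "second"  # position of the connecting means (assumed switch)
        self.second_up = True              # is the second computer running?
        self.mirror_via_second = True      # normal two-processor mirroring
        self.pending = set()               # blocks written to disk1 but not yet to disk2

    def scan(self):
        """Storage the first computer's operating system can currently see."""
        if self.disk2_attached_to == "first":
            return ["disk1", "disk2"]
        return ["disk1"]

    def write(self, block, data):
        """A write issued by the first computer."""
        self.disk1[block] = data
        if self.mirror_via_second and self.second_up:
            self.disk2[block] = data       # second computer writes the mirror copy
        elif self.disk2_attached_to == "first":
            self.disk2[block] = data       # single-processor operation, mirroring all data
        else:
            self.pending.add(block)        # remember data not mirrored

    def recover(self):
        """Run the FIG. 3 steps on the first computer; returns False if no failure."""
        if self.second_up:
            return False                   # no failure detected
        self.mirror_via_second = False     # discontinue mirror copy on other system
        self.disk2_attached_to = "first"   # disconnect from other processor, connect to this one
        if "disk2" not in self.scan():     # scan for newly-connected mass storage
            raise RuntimeError("second mass storage not visible after scan")
        for block in sorted(self.pending): # write non-mirrored data
            self.disk2[block] = self.disk1[block]
        self.pending.clear()
        return True
```

In this sketch a write made after the second computer fails but before recover() runs is only recorded in pending. After recover() returns, both dictionaries hold the same blocks and later writes are mirrored directly by the first computer.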

      REMOVE MASS STORAGE SYSTEM ORIGINALLY ON SECOND PROCESSOR
403   CONNECT MASS STORAGE SYSTEM TO SECOND PROCESSOR
      SCAN FOR NEWLY-CONNECTED MASS STORAGE SYSTEM
405   REESTABLISH TWO PROCESSOR DATA MIRRORING

FIG. 4
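
The restoration flow of FIG. 4 can be sketched in the same spirit. This is an illustrative model only, not the appendix source code. The class RestorableMirror, its attributes, and the string labels are assumptions. The initial state assumes the second computer is unavailable and the second mass storage system is connected to the first computer.

```python
class RestorableMirror:
    """Toy model of restoring two-processor mirroring after a repair."""

    def __init__(self):
        self.second_up = False                  # second computer initially unavailable
        self.switch = "first"                   # connecting means points at first computer
        self.first_visible = {"disk1", "disk2"} # storage known to the first computer's OS
        self.second_visible = set()             # storage known to the second computer's OS
        self.mode = "single-processor"          # first computer mirrors on its own

    def scan(self, computer):
        """Storage a given computer would find if its OS scanned now."""
        found = {"disk1"} if computer == "first" else set()
        if self.switch == computer:
            found.add("disk2")
        return found

    def restore(self):
        """Run the FIG. 4 steps; returns False while the second computer is down."""
        if not self.second_up:
            return False                        # second computer not yet available
        self.first_visible.discard("disk2")     # remove mass storage system from first OS
        self.switch = "second"                  # connect mass storage system to second processor
        self.second_visible = self.scan("second")  # scan for newly-connected mass storage
        if "disk2" not in self.second_visible:
            raise RuntimeError("second mass storage not found by second computer")
        self.mode = "two-processor"             # reestablish two processor data mirroring
        return True
```

The order matters in this sketch: the storage is removed from the first computer's view before the switch is moved, so at no point do both operating systems believe they own the second mass storage system.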